Influence of the Sparse Matrix Structure on Automatic Parallelisation Efficiency
Abstract
The simulated models and requirements of engineering programs such as computational fluid dynamics and structural mechanics grow more rapidly than single-processor performance. Automatic parallelisation seems to be the obvious approach for huge legacy packages like PERMAS. In this paper we evaluate how preparatory steps on the large input matrices can improve the performance of the parallelisation. We show that a preparatory blocking of the matrix saves storage and decreases the critical path length of the task graph when it is done with variable-sized blocks. We also propose a data distribution step that drives the modified dynamic scheduler. Results of this combination show an efficient parallelisation of the programs even on slow multiprocessor networks. Finally, the last step proposed is to interleave the array blocks that are distributed to different processors using a post-ordering algorithm. This step is essential to expose the parallelism to the scheduler.
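The abstract uses the critical path length of the task graph as its measure of available parallelism. As a rough illustration of that metric only, the following sketch computes the longest cost-weighted chain of dependent tasks in a directed acyclic graph. The function name `critical_path_length`, the task names, and the unit costs are hypothetical and chosen for this example; this is not the scheduler or blocking scheme used in PERMAS or in the paper.

```python
from collections import defaultdict, deque


def critical_path_length(costs, deps):
    """Return the length of the longest cost-weighted path in a task DAG.

    costs: dict mapping task name -> execution cost (hypothetical units)
    deps:  list of (before, after) pairs; `after` may start only once
           `before` has finished
    """
    successors = defaultdict(list)
    indegree = {task: 0 for task in costs}
    for before, after in deps:
        successors[before].append(after)
        indegree[after] += 1

    # Earliest possible finish time of each task, assuming unlimited processors.
    finish = dict(costs)
    ready = deque(task for task in costs if indegree[task] == 0)
    visited = 0

    # Process tasks in topological order (Kahn's algorithm).
    while ready:
        task = ready.popleft()
        visited += 1
        for nxt in successors[task]:
            finish[nxt] = max(finish[nxt], finish[task] + costs[nxt])
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if visited != len(costs):
        raise ValueError("dependency graph contains a cycle")
    return max(finish.values(), default=0)


unit = {"a": 1, "b": 1, "c": 1, "d": 1}
# Four tasks in one dependency chain: no two can run concurrently.
chain = critical_path_length(unit, [("a", "b"), ("b", "c"), ("c", "d")])
# The same four tasks as two independent chains: two processors halve the time.
split = critical_path_length(unit, [("a", "b"), ("c", "d")])
print(chain, split)  # prints "4 2"
```

With the same total work, the second graph has half the critical path length, so a scheduler with two processors could in principle finish it in half the time. A shorter critical path therefore means more parallelism is exposed to the scheduler.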
Similar resources
On the generic parallelisation of iterative solvers for the finite element method
The numerical solution of partial differential equations frequently requires solving large and sparse linear systems. When using the Finite Element Method these systems exhibit a natural block structure that is exploited for efficiency in the “Iterative Solver Template Library” (ISTL). Based on existing sequential preconditioned iterative solvers we present an abstract parallelisation approach ...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
UCHPC – UnConventional High Performance Computing for Finite Element Simulations
Processor technology is still dramatically advancing and promises enormous improvements in processing data for the next decade. These improvements are driven by parallelisation and specialisation of resources, and ‘unconventional hardware’ like GPUs or the Cell processor can be seen as forerunners of this development. At the same time, much smaller advances are expected in moving data; this mea...
Parallel Geometric Multigrid
Multigrid methods are among the fastest numerical algorithms for the solution of large sparse systems of linear equations. While these algorithms exhibit asymptotically optimal computational complexity, their efficient parallelisation is hampered by the poor computation-to-communication ratio on the coarse grids. Our contribution discusses parallelisation techniques for geometric multigrid meth...
Publication date: 1999